Using the Florence2 language model to generate captions from images

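Download the custom node

In ComfyUI, open Manager > Custom Nodes Manager. Type the name of the custom node you want into the search box in the upper-left corner, then install it.

Search for Florence2
Install the ComfyUI-Florence2 custom node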

Florence2 is a vision-language model (VLM) developed by Microsoft. It handles tasks involving images, objects, and related text (OCR).

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.

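In ComfyUI, we can use the Florence2 model to recognize an image and produce a text description of it, then use that description as the prompt for image generation.

Keep in mind that this approach is text-to-image (Text 2 Image), not image-to-image (Image 2 Image): only the generated text description is passed on to the image generator, not the source image.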
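
As a rough illustration of this text-to-image flow, the sketch below turns a caption into a prompt string. The helper name `caption_to_prompt`, the list of stripped lead-in phrases, and the extra style tags are all assumptions made for this example and are not part of ComfyUI-Florence2. Inside ComfyUI no code is needed: the caption output would typically be connected straight to the prompt text input.

```python
# Hypothetical helper: the function name and the phrase list below are
# assumptions for illustration, not part of any ComfyUI node.
LEAD_INS = (
    "The image shows ",
    "The image is ",
    "In the center of the image we can see ",
)


def caption_to_prompt(caption: str, extra_tags: tuple[str, ...] = ()) -> str:
    """Turn a caption into a text-to-image prompt.

    Only text is returned. The source image is never handed to the image
    generator, which is what makes this text-to-image.
    """
    text = caption.strip()
    # Drop a descriptive lead-in phrase if the caption starts with one.
    for lead_in in LEAD_INS:
        if text.startswith(lead_in):
            text = text[len(lead_in):]
            break
    # Remove the trailing period so tags can be appended with commas.
    text = text.rstrip(".")
    return ", ".join([text, *extra_tags])


caption = ("The image shows a brown and white Pembroke Welsh Corgi standing "
           "in the middle of a forest, surrounded by trees.")
print(caption_to_prompt(caption, extra_tags=("photo", "soft light")))
```

Stripping the lead-in phrase is optional. Whether it helps depends on the image model that receives the prompt, so treat it as something to experiment with rather than a required step.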

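What we need is the caption. The caption text differs from model to model and from task to task.
Take this picture of a corgi in a forest as an example. Below are Chinese translations of the captions, produced automatically by the Gemma 3 LLM and kept here for reference (說明 = caption, 詳細說明 = detailed caption, 更詳細說明 = more detailed caption):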

說明：一隻狗站在森林中央。
詳細說明：圖片顯示一隻棕白色的彭布魯克威爾士柯基犬站在森林中央，周圍環繞著樹木。
更詳細說明：圖片是一張狗站在森林裡的的照片。這隻狗是彭布魯克威爾士柯基犬，有棕白色的毛皮和一條黑色的項圈。它站在落葉的地面上，狗的兩側都有樹木。背景是模糊的，但似乎是一個有陽光穿過樹木的霧氣森林。圖片的整體氛圍是平和而寧靜的。

說明：一隻棕白色的狗站在森林裡。
詳細說明：圖片顯示一隻彭布魯克威爾士柯基犬站在黑暗的森林中央。這隻狗是白色和棕色的，脖子上戴著一條黑色的腰帶。地面上覆蓋著乾枯的落葉，樹木被濃霧籠罩，營造出一種神秘的氛圍。
更詳細說明：圖片是一張彭布魯克威爾士柯基犬站在茂密森林中央的特寫肖像。這隻狗面向鏡頭，直接看著鏡頭，嘴巴微微張開，好像在喘氣或哈氣。陽光穿過樹木，照亮狗的毛皮。背景中的樹木又高又細，樹幹伸向天空。地面上覆蓋著落葉，營造出一種黑暗而怪異的氛圍。圖片的整體氛圍是平和而寧靜的。

說明：一隻彭布魯克威爾士柯基犬站在日落時分的森林裡。
詳細說明：一張彭布魯克威爾士柯基犬站在森林小徑上，看著鏡頭，表情快樂的照片。這隻狗有棕白色的毛皮和一條黑色的項圈。背景是一片茂密的森林，陽光穿過樹木，照亮整個場景。地面上覆蓋著落葉，整體氛圍是平和而寧靜的。光線柔和而溫暖，營造出一種寧靜的氛圍。圖片質量高，風格逼真。
更詳細說明：一張從正面拍攝的逼真照片，描繪了一隻彭布魯克威爾士柯基犬站在日落時分的森林裡。圖片也呈現出一種寧靜而平和的氛圍。在圖片的中央，一隻小巧的棕白色彭布魯克犬站在森林小徑上，直接看著觀看者，全身可見。這隻狗表情快樂，吐著舌頭，脖子上戴著一條黑色的項圈。它的耳朵豎起，眼睛睜大，給人一種警覺的感覺。背景是高大的黑暗樹木，陽光穿過樹木，營造出一種柔和、空靈的光芒。地面上覆蓋著秋葉，增添了寧靜的氛圍。整體氛圍是寧靜和安詳。

The original English captions follow.

caption: A dog standing in the middle of a forest.
detailed caption: The image shows a brown and white Pembroke Welsh Corgi standing in the middle of a forest, surrounded by trees.
more detailed caption: The image is a photograph of a dog standing in a forest. The dog is a Pembroke Welsh Corgi, with a brown and white coat and a black collar around its neck. It is standing on a bed of fallen leaves, with trees on either side of the dog. The background is blurred, but it appears to be a foggy forest with sunlight shining through the trees. The overall mood of the image is peaceful and serene.

caption: A dog standing in the middle of a forest.
detailed caption: In the center of the image we can see a dog. In the background there are trees. At the bottom there are leaves.
more detailed caption: A brown and white dog is standing on the ground. There are trees behind the dog. The dog has a black collar around its neck.

caption: A brown and white dog standing in a forest.
detailed caption: The image shows a Pembroke Welsh Corgi standing in the middle of a dark forest. The dog is white and brown in color and is wearing a black belt around its neck. The ground is covered with dried leaves, and the trees are shrouded in a thick fog, creating a mysterious atmosphere.
more detailed caption: The image is a close-up portrait of a Pembroke Welsh Corgi dog standing in the middle of a dense forest. The dog is facing the camera and is looking directly at the camera with its mouth slightly open, as if it is panting or panting. The sun is shining through the trees, casting a warm glow on the dog's fur. The trees in the background are tall and thin, with their trunks reaching up towards the sky. The ground is covered in fallen leaves, creating a dark and eerie atmosphere. The overall mood of the image is peaceful and serene.

caption: A pembroke welsh corgi dog standing in a forest during sunset
detailed caption: Photo of a Pembroke Welsh Corgi dog standing on a forest path, looking at the camera with a happy expression. The dog has a brown and white coat with a black collar around its neck. The background is a dense forest with sunlight filtering through the trees, casting a warm glow over the scene. The ground is covered in fallen leaves, and the overall mood is peaceful and serene. The lighting is soft and warm, creating a serene atmosphere. The image is high quality and has a realistic style.
more detailed caption: A photo-realistic shoot from a front camera angle about a pembroke welsh corgi dog standing in a forest during sunset. the image also shows a serene and peaceful atmosphere. on the middle of the image, a small, brown and white pembroken dog standing on a forest path, looking directly at the viewer with its full body visible. the dog has a happy expression, with its tongue out and its tongue hanging out, wearing a black collar around its neck. its ears are perked up, and its eyes are wide open, giving it a sense of alertness. the background features tall, dark trees with sunlight filtering through, creating a soft, ethereal glow. the ground is covered in autumn leaves, adding to the serene atmosphere. the overall mood is one of tranquility and serenity.
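
The three caption levels listed above correspond to three Florence-2 task prompts. For readers who want to reproduce this kind of output outside ComfyUI, here is a minimal sketch that follows the usage pattern shown on the Florence-2 model card on Hugging Face. It assumes `torch`, `transformers`, and `Pillow` are installed. The function name `caption_image`, the choice of `microsoft/Florence-2-base`, and the generation settings are assumptions for this example. The notes above do not say which model produced each set of captions, so the results will not necessarily match them.

```python
# Task prompts as listed on the Florence-2 model card.
TASK_PROMPTS = {
    "caption": "<CAPTION>",
    "detailed caption": "<DETAILED_CAPTION>",
    "more detailed caption": "<MORE_DETAILED_CAPTION>",
}


def caption_image(image_path: str, level: str = "caption",
                  model_id: str = "microsoft/Florence-2-base") -> str:
    """Return one caption for the image at the requested level of detail."""
    # Imported here so the task table above is usable without these packages.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    task = TASK_PROMPTS[level]
    # Florence-2 ships its own modeling code, hence trust_remote_code=True.
    # The model is reloaded on every call to keep the sketch short; a real
    # script would load it once and reuse it.
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=task, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=256,  # assumed limit, enough for a long caption
        num_beams=3,
    )
    raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    # post_process_generation returns a dict keyed by the task prompt.
    parsed = processor.post_process_generation(
        raw, task=task, image_size=(image.width, image.height)
    )
    return parsed[task]
```

Calling `caption_image("corgi.png", "detailed caption")` downloads the model on first use. `corgi.png` is a placeholder path for your own image.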